Statistical Modelling for Outlier Factors

Author

  • Ahmet Kaya
Abstract

Errors in data are one of the factors that cause parameter estimates to be subjective. When an erroneous case is established statistically, it is called an outlier. Outliers are defined as the few observations or records that appear inconsistent with the rest of the sample and that have a stronger effect on predicted values. Isolated outliers may also have a positive impact on the results of data analysis, data mining and the estimated model. In this study, we are concerned with outliers in time series, which have two special cases: the innovational outlier (IO) and the additive outlier (AO). The occurrence of an AO indicates that action is required, possibly to adjust the measuring instrument or to correct a mistake made by the person observing or recording. If an IO occurs, however, no adjustment of the measurement operation is required. The study also fits a multi-factor (4 × 2 × 3²) model, by means of a simulation study, to assess statistically the effects of four factors: AR(1) coefficient (0.5, 0.7, 0.9), outlier type (AO, IO), series length (50, 100, 200, 500) and criterion value sensitivity (99% (C = 3.00), 95% (C = 3.50), 90% (C = 4.00)). The results of the analysis of variance on the outlier factors are also discussed.
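
The difference between the two outlier types can be illustrated with a minimal sketch. It is not the author's simulation code: the function name `simulate_ar1`, the shock size `omega = 5.0`, the shock position `t0 = 50` and the fixed seed are assumptions chosen for illustration. Only the factor levels in `design` are taken from the abstract. An AO perturbs a single observed value, while an IO enters through the innovation and so decays through the AR(1) dynamics as `omega * phi**k`.

```python
from itertools import product

import numpy as np


def simulate_ar1(n, phi, outlier=None, t0=None, omega=0.0, seed=0):
    """Simulate x[t] = phi * x[t-1] + eps[t] with an optional outlier at t0.

    outlier="AO": omega is added to the observed value x[t0] only.
    outlier="IO": omega is added to the innovation eps[t0], so it
    propagates to later observations as omega * phi**k.
    (Hypothetical helper; names and defaults are illustrative.)
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    if outlier == "IO":
        eps[t0] += omega          # shock enters the noise process
    x = np.zeros(n)
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    if outlier == "AO":
        x[t0] += omega            # shock affects one observation only
    return x


# Same seed, so the three series differ only through the outlier.
clean = simulate_ar1(100, 0.7)
ao = simulate_ar1(100, 0.7, outlier="AO", t0=50, omega=5.0)
io = simulate_ar1(100, 0.7, outlier="IO", t0=50, omega=5.0)

# Factor levels as listed in the abstract: 3 x 2 x 4 x 3 design cells.
design = list(product(
    (0.5, 0.7, 0.9),        # AR(1) coefficient
    ("AO", "IO"),           # outlier type
    (50, 100, 200, 500),    # series length
    (3.00, 3.50, 4.00),     # criterion value C
))
```

Subtracting `clean` from `ao` leaves a single spike at index 50, whereas `io - clean` shows a geometrically decaying tail from index 50 onward. Looping `simulate_ar1` over the cells in `design` would give one way to set up a simulation grid of the kind the abstract describes.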

Similar resources

A statistical test for outlier identification in data envelopment analysis

In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these "deterministic" frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the prese...

Outlier detection for high-dimensional data

Outlier detection is an integral component of statistical modelling and estimation. For high-dimensional data, classical methods based on the Mahalanobis distance are usually not applicable. We propose an outlier detection procedure that replaces the classical minimum covariance determinant estimator with a high-breakdown minimum diagonal product estimator. The cut-off value is obtained from the...

Outlier (Anomaly) Detection Modelling in PMML

PMML is an industry-standard XML-based open format for representing statistical and data mining models. Since PMML does not yet support outlier (anomaly) detection, in this paper we propose a new outlier detection model to foster interoperability in this emerging field. Our proposal is included in the PMML RoadMap for PMML 4.4. We demonstrate the proposed format on one supervised and two unsupe...

Refining Landscape Change Models through Outlier Analysis in the Muskegon Watershed of Michigan

Balancing natural resource protection and urban development is of concern to researchers, planners and citizens who are aware of the environmental, social and economic impacts of urban land use. Land-use change models can assist in finding this balance. An objective of this research was to build a better model of land-use change by integrating quantitative and qualitative techniques. A modellin...

Identification of Data Mining Techniques for Industrial Process Analysis and Control

This paper describes data mining techniques for application to industrial process analysis tasks. The aim was to identify data mining techniques that support exploratory analysis for performance assessment, process modelling, and fault diagnosis tasks that form the foundations of process control. The data mining technique base was investigated by conducting experiments using both synthetic and ...

Publication date: 2010